LANGUAGE IDENTIFICATION USING G-LDA

Similar resources

Language Identification Using G-LDA

Language identification plays an important role in Natural Language Processing applications as one of the pre-processing steps. Various mechanisms are in use today that achieve this task with high recognition rates. Recent years have seen rapid growth in international communication, which has led to a need for systems capable of correctly identifying the languages of documents. Possi...

Style & Topic Language Model Adaptation Using HMM-LDA

Adapting language models across styles and topics, such as for lecture transcription, involves combining generic style models with topic-specific content relevant to the target document. In this work, we investigate the use of the Hidden Markov Model with Latent Dirichlet Allocation (HMM-LDA) to obtain syntactic state and semantic topic assignments to word instances in the training corpus. From...

Language Identification using Classifier Ensembles

In this paper we describe the language identification system we developed for the Discriminating Similar Languages (DSL) 2015 shared task. We constructed a classifier ensemble composed of several Support Vector Machine (SVM) base classifiers, each trained on a single feature type. Our feature types include character 1–6 grams and word unigrams and bigrams. Using this system we were able to outp...

Automatic language identification using wavelets

Spoken language identification consists of recognizing a language based on a sample of speech from an unknown speaker. The traditional approach to this task mainly considers the phonotactic information of languages. However, for marginalized languages (languages with few speakers or oral languages without a fixed writing standard), this information is practically not at hand and consequently ...

Journal

Journal title: International Journal of Research in Engineering and Technology

Year: 2013

ISSN: 2321-7308, 2319-1163

DOI: 10.15623/ijret.2013.0211008